Cost-effective Data-parallel Load Balancing
Abstract
Load balancing algorithms improve a program's performance on unbalanced datasets but can degrade it on balanced datasets, where they trigger unnecessary load redistributions. This paper presents a cost-effective data-parallel load balancing algorithm that redistributes load only when the possible savings outweigh the redistribution costs. Experiments with a data-parallel polygon renderer using this algorithm show a performance improvement of up to a factor of 33 on unbalanced datasets and a maximum performance loss of only 27 percent on balanced datasets.
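The decision rule described in the abstract (redistribute only when the possible savings outweigh the redistribution costs) can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration and is not taken from the paper: one unit of load is assumed to take one unit of time, the saving is estimated as the gap between the busiest processor and the mean load, and the cost is a hypothetical `fixed_cost` plus a hypothetical `cost_per_item` for every unit of load that must move. The function name `should_redistribute` is likewise invented for this example.

```python
from statistics import mean


def should_redistribute(loads, cost_per_item, fixed_cost=0.0):
    """Return True when the estimated saving exceeds the estimated cost.

    loads         -- work currently assigned to each processor (time units)
    cost_per_item -- assumed cost of moving one unit of load (hypothetical)
    fixed_cost    -- assumed fixed overhead of one redistribution (hypothetical)
    """
    target = mean(loads)
    # A data-parallel step finishes when the busiest processor finishes,
    # so the best-case saving is the gap between the maximum and mean load.
    saving = max(loads) - target
    # Every unit of load above the mean has to move to another processor.
    moved = sum(load - target for load in loads if load > target)
    cost = fixed_cost + cost_per_item * moved
    return saving > cost


if __name__ == "__main__":
    print(should_redistribute([100, 0, 0, 0], cost_per_item=0.1))
    print(should_redistribute([26, 24, 25, 25], cost_per_item=0.1, fixed_cost=5))
```

Under these assumptions, a heavily skewed load triggers a redistribution, while a nearly balanced load does not, because the small saving cannot pay for the fixed overhead. This is the mechanism by which such a rule avoids unnecessary redistributions on balanced datasets.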
Similar resources
PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper describes the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculati...
Exploiting Partial Replication in Unbalanced Parallel Loop Scheduling on Multicomputers
We consider the problem of scheduling parallel loops whose iterations operate on large array data structures and are characterized by highly varying execution times (unbalanced or non-uniform parallel loops). A general parallel loop implementation template for message-passing distributed-memory multiprocessors (multicomputers) is presented. Assuming that it is impossible to statically determine...
Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers
Data distribution has been one of the most important research topics in parallelizing compilers for distributed memory parallel computers. A good data distribution scheme should consider both the computation load balance and the communication overhead. In this paper, we show that data redistribution is necessary for executing a sequence of Do-loops if the communication cost due to performing this...
Parallel Computing and Domain Decomposition
Domain decomposition techniques appear to be a natural way to make good use of parallel computers. In particular, these techniques divide a computation into a local part, which may be done without any interprocessor communication, and a part that involves communication between neighboring and distant processors. This paper discusses some of the issues in designing and implementing a parallel domain d...
Region Analysis: A Parallel Elimination Method for Data Flow Analysis
Parallel data flow analysis methods offer the promise of calculating detailed semantic information about a program at compile-time more efficiently than sequential techniques. Previous work on parallel elimination methods [1, 2] has been hampered by the lack of control over interval size; this can prohibit effective parallel execution of these methods. To overcome this problem, we have designed the ...